Practice with Tidyverse

# load in packages
library(tidyverse)
## Warning: package 'ggplot2' was built under R version 4.2.3
## Warning: package 'tibble' was built under R version 4.2.3
## Warning: package 'dplyr' was built under R version 4.2.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Fixing Bad Object Names

mfn <- 12
'Saras fave food' <- "pizza"
passcode1 <- "password12345678"
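One possible cleanup, as a sketch only. The notes do not say what `mfn` stands for, so "my favorite number" is a guess, and all three new names are hypothetical. The idea is that names should be descriptive, free of spaces and quotes, and written in one consistent style (snake_case here).

```r
# Descriptive snake_case names; all three are hypothetical replacements.
my_favorite_number <- 12          # assumes `mfn` meant "my favorite number"
sara_favorite_food <- "pizza"     # no spaces, so no quotes needed around the name
passcode <- "password12345678"    # drops the uninformative trailing digit
```

A name like `sara_favorite_food` can be typed directly, whereas the quoted original needs backticks every time it is used.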

Fixing Bad Code

Let’s say we have some point data and want to find a distance. First, we simulate ten random (x, y) points:

set.seed(123)
points <- data.frame(
  x = rnorm(10),
  y = rnorm(10)
)

Bad #1

# Nested function calculation
sqrt(sum(abs(mean(points$x))))
## [1] 0.2731769

Bad #2

# using intermediate objects (reusing the name `mean` for a value is confusing)
mean <- mean(points$x)
abs_mean <- abs(mean)
sum_abs <- sum(abs_mean)
sqrt_sum <- sqrt(sum_abs)

Try to use pipes in tidyverse

# tidyverse pipes %>%
points$x %>% 
  mean() %>% 
  abs() %>% 
  sum() %>%  
  sqrt()
## [1] 0.2731769
# select() returns a data frame, not a numeric vector, so mean() returns NA
points %>% 
  select(x) %>% 
  mean() %>% 
  abs() %>% 
  sum() %>%  
  sqrt()
## Warning in mean.default(.): argument is not numeric or logical: returning NA
## [1] NA
# %>% binds tighter than *, so this computes points$x * sum(points$y)
points$x * points$y  %>% 
  sum()
##  [1] -1.1692753 -0.4802008  3.2518078  0.1470960  0.2697226  3.5780022
##  [7]  0.9615724 -2.6391956 -1.4329259 -0.9297487
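The last two results above are worth a second look. `select(x)` keeps `x` inside a data frame, so `mean()` has nothing numeric to average, and `%>%` has higher precedence than `*`, so only `points$y` gets piped into `sum()`. A minimal sketch of one way around both, assuming the same simulated `points` data; `root_abs_mean` and `sum_xy` are names made up for this example:

```r
library(dplyr)

# Recreate the simulated data so this block runs on its own
set.seed(123)
points <- data.frame(x = rnorm(10), y = rnorm(10))

# pull() extracts the column as a plain numeric vector, so mean() works
root_abs_mean <- points %>%
  pull(x) %>%
  mean() %>%
  abs() %>%
  sum() %>%
  sqrt()

# Parentheses force the element-wise product to happen before the pipe,
# so sum() receives all ten products and returns a single number
sum_xy <- (points$x * points$y) %>%
  sum()
```

`root_abs_mean` should match the value from the `points$x %>% ...` pipeline, and `sum_xy` is a single number rather than a vector of ten.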

Practice Reading in Data

# In RStudio: Session ->
# Set Working Directory ->
# To Source File Location
# The data files need to be in the same folder as your .Rmd file

# Alternatively, call setwd(path) in code
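A quick way to confirm the working directory is set correctly before reading anything, sketched with base R only. The file name checked at the end is the CSV listed in these notes; whether it is found depends on your own folder.

```r
# Show the directory R currently treats as the working directory
getwd()

# List data files R can see there; an empty result suggests the wrong folder
list.files(pattern = "\\.(csv|txt|xlsx)$")

# Check for one specific file (TRUE only if it sits in the working directory)
file.exists("student_performance_data.csv")
```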

Three Types of Data

toenail.txt Text File

student_performance_data.csv Comma-Separated Values File

retail.xlsx Microsoft Excel Worksheet

Reading CSV

# the CSV and text readers all come from the readr package, which is loaded with tidyverse
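A minimal sketch of `read_csv()`. So that it runs anywhere, it first writes a tiny stand-in file to a temporary location. The columns `student_id` and `score` are invented for illustration and are not the columns of the class file.

```r
library(readr)

# Toy stand-in file so the example runs anywhere; columns are hypothetical
csv_path <- tempfile(fileext = ".csv")
writeLines(c("student_id,score", "1,88", "2,92", "3,75"), csv_path)

# read_csv() guesses each column's type and returns a tibble
scores <- read_csv(csv_path, show_col_types = FALSE)
scores

# With the class file in the working directory, the call would look like:
# student_data <- read_csv("student_performance_data.csv")
```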

Reading Text
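The layout of `toenail.txt` is not shown in these notes, so this is a sketch under an assumption: a header row with values separated by single spaces. The toy columns below are invented. If the real file is tab-separated, `delim = "\t"` (or `read_tsv()`) would be the likely adjustment.

```r
library(readr)

# Toy stand-in text file; the column names and values are hypothetical
txt_path <- tempfile(fileext = ".txt")
writeLines(c("id visit outcome", "1 1 0", "1 2 1", "2 1 0"), txt_path)

# read_delim() lets us state the separator explicitly
toy_text <- read_delim(txt_path, delim = " ", show_col_types = FALSE)
toy_text

# With the class file in the working directory, the call might look like:
# toenail <- read_delim("toenail.txt", delim = " ")
```

Opening the file in a text editor first is the easiest way to see which separator it actually uses.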

Reading Microsoft Excel Worksheet

  • Look at the data file
  • What do we see?
  • Multiple pages (sheets)
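When a workbook has several sheets, the readxl package can list them and read one at a time. readxl is installed along with tidyverse but is not attached by `library(tidyverse)`, so it needs its own `library()` call. The sketch below uses an example workbook that ships with readxl as a stand-in for `retail.xlsx`, whose sheet names are not given in these notes.

```r
library(readxl)

# Example workbook bundled with readxl; a stand-in for retail.xlsx
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheet names in the workbook
sheet_names <- excel_sheets(xlsx_path)
sheet_names

# Read one sheet by name; a sheet position (e.g. sheet = 2) also works
first_sheet <- read_excel(xlsx_path, sheet = sheet_names[1])
head(first_sheet)

# With the class file in the working directory, start with:
# excel_sheets("retail.xlsx")
```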